
docs: define sensitivity and protection method#151

Open
asteier2026 wants to merge 5 commits into main from asteier2026/docs/sensitivity

Conversation

@asteier2026
Contributor

Changes include:

  • Documentation for how sensitivity and protection method are assigned.

asteier2026 requested a review from a team as a code owner May 11, 2026 16:02
@greptile-apps
Contributor

greptile-apps Bot commented May 11, 2026

Greptile Summary

This PR adds a new "Key concepts" section to docs/concepts/rewrite.md that formally defines Sensitivity and Protection method — two core concepts referenced throughout the rest of the document. The tables satisfy the previous review feedback requesting both discrete sensitivity levels and a structured breakdown of protection method values.

  • Sensitivity table introduces the high/medium/low levels with their leakage weights (1.0 / 0.6 / 0.3), which are now properly anchored before their first use in the leakage mass formula later in the doc.
  • Protection method table lists all five methods (replace, generalize, suppress_inference, remove, leave_as_is) with descriptions and typical-use guidance, matching the exact enum values used in the codebase.
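A minimal Python sketch of how the two concepts summarized above could be modelled. The class names `Sensitivity` and `ProtectionMethod` and the `LEAKAGE_WEIGHTS` mapping are hypothetical and are not taken from `src/anonymizer/engine/schemas/rewrite.py`; only the string values and the weights come from this summary.

```python
from enum import Enum


class Sensitivity(str, Enum):
    """Hypothetical enum for the three sensitivity levels named in the summary."""

    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class ProtectionMethod(str, Enum):
    """Hypothetical enum for the five protection methods named in the summary."""

    REPLACE = "replace"
    GENERALIZE = "generalize"
    SUPPRESS_INFERENCE = "suppress_inference"
    REMOVE = "remove"
    LEAVE_AS_IS = "leave_as_is"


# Leakage weights as stated in the summary (1.0 / 0.6 / 0.3).
# The mapping name is an assumption made for this sketch.
LEAKAGE_WEIGHTS = {
    Sensitivity.HIGH: 1.0,
    Sensitivity.MEDIUM: 0.6,
    Sensitivity.LOW: 0.3,
}
```

Deriving from `str` lets a raw value such as `"leave_as_is"` round-trip through `ProtectionMethod("leave_as_is")`, which is one common way to keep documented names and enum values in step.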

Confidence Score: 5/5

Documentation-only change with no code modifications; safe to merge.

The change adds a purely descriptive section to an existing doc file. The sensitivity weights and protection method names are consistent with the values already used in the leakage-mass formula and output-columns table elsewhere in the document, and they match the enum values in the codebase.

No files require special attention.

Important Files Changed

| Filename | Overview |
|----------|----------|
| docs/concepts/rewrite.md | Documentation-only addition of a "Key concepts" section; content is accurate and consistent with the rest of the document and the codebase enum values. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Entity detected] --> B[Sensitivity assigned\nby disposition_analyzer]
    B --> C{Sensitivity level}
    C -->|high, weight 1.0| D[Direct identifier\nNames, IDs, contact details]
    C -->|medium, weight 0.6| E[Quasi-identifier\nLocation, occupation, age]
    C -->|low, weight 0.3| F[Generic attribute\nWidely shared traits]

    D & E & F --> G[Protection method chosen\nholistic document view]
    G --> H{Method}
    H --> I[replace\nSynthetic alternative]
    H --> J[generalize\nBroader form]
    H --> K[suppress_inference\nRewrite surrounding text]
    H --> L[remove\nDelete entity]
    H --> M[leave_as_is\nNo change needed]

    I & J & K & L & M --> N[Leakage scoring\nleakage_mass = Σ weight × confidence]
```
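The final scoring step in the flowchart, `leakage_mass = Σ weight × confidence`, can be sketched in Python as follows. This is an illustration only: `LeakedEntity`, `WEIGHTS`, and `leakage_mass` are hypothetical names, and the page does not state which entities enter the sum or how `confidence` is produced, so the sketch simply sums over whatever list it is given.

```python
from dataclasses import dataclass

# Weights per sensitivity level, as listed in the summary.
WEIGHTS = {"high": 1.0, "medium": 0.6, "low": 0.3}


@dataclass
class LeakedEntity:
    """Hypothetical record for one entity that contributes to the score."""

    label: str
    sensitivity: str   # "high", "medium", or "low"
    confidence: float  # assumed to lie in [0, 1]


def leakage_mass(entities):
    """Sum of weight * confidence over the given entities."""
    return sum(WEIGHTS[e.sensitivity] * e.confidence for e in entities)


example = [
    LeakedEntity("name", "high", 0.9),
    LeakedEntity("city", "medium", 0.5),
    LeakedEntity("hobby", "low", 1.0),
]
score = leakage_mass(example)  # 1.0*0.9 + 0.6*0.5 + 0.3*1.0, about 1.5
```

An empty list gives a score of 0, and one high-sensitivity entity at full confidence contributes as much as roughly three low-sensitivity ones.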

Reviews (5): Last reviewed commit: "fix: add more detail and organization to..."

Comment thread docs/concepts/rewrite.md
Comment thread docs/concepts/rewrite.md Outdated
Comment thread docs/concepts/rewrite.md Outdated
Comment thread src/anonymizer/engine/schemas/rewrite.py Outdated
asteier2026 and others added 3 commits May 13, 2026 08:43
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Comment thread docs/concepts/rewrite.md
|-------|---------|---------|----------------|
| `high` | Exposure alone can identify a person | Names, ID numbers, contact details | 1.0 |
| `medium` | Meaningfully narrows the identity space | Location, occupation, age | 0.6 |
| `low` | Minimal standalone identifying power | Generic attributes, widely shared traits | 0.3 |
Contributor

You could add gender here as an example as well!

Comment thread docs/concepts/rewrite.md
| `generalize` | Replaces the entity with a broader form | Quasi-identifiers (exact date → quarter, city → region) |
| `suppress_inference` | Rewrites the surrounding text to remove cues that enable the inference | Latent entities that are implied rather than stated |
| `remove` | Deletes the entity entirely | Cases where neither replacement nor generalization can preserve meaning without retaining the identifying detail |
| `leave_as_is` | Leaves the entity unchanged | Entities judged not to require protection in context |
Contributor

Could also use gender as an example here
